home *** CD-ROM | disk | FTP | other *** search
- ┌──────────────────────────────────────────────────────────────────────────────┐
- │ Previous notes: │
- │ -The author assume no responsability for any effects/damage │
- │ this file can produce. │
- │ -This file cannot be modified without the author's permission. │
- │ -This file can be (and must be !) freely distributed │
- └──────────────────────────────────────────────────────────────────────────────┘
-
-
- ================================================================================
- MORE ABOUT SHADING, Z-BUFFER, AND TEXTURES
- ================================================================================
-
-
- I heard not so long ago talking about method of shading called
- Gouraud shading, mainly based on the interpolation of the
- light intensity along the edge of the polygon and then along
- each scanline of it.
- The book in wich I read it was saying: "very approximative but
- very fast, this method is often used in real-time applications,
- because of its low cost in calculation time"(or sumthin like that..).
- After, I've seen it in many demos, and didn't understand how
- a so slow and uncomfortable algorithm could be used for animating
- thousand of vectors at a reasonnable frame-rate on my computer.
- It's only a few months ago I realized how fast and comfortable
- This method was and how many doors in the domain of 3d animation
- it opened to me.
- It's the principle of interpolating values along a polygon using
- fixed point math that is the key of many other algorithms,
- such as:
- - the degenerated Gouraud's: Z-gouraud, distance-based Gouraud
- - Phong shading
- - Z-buffer
- - texture mapping
- - many other more advanced techniques like:
- - environment/reflection mapping
- - bump mapping
- - A-buffer
- - and much more I expect you'll find and write to me !..
-
- ┌───────────────────────────────────┐
- │1) The first step: Gouraud shading │
- └───────────────────────────────────┘
-
- I won't repeat here what many other docs already told,
- but just make a brief recapitulation.
- Here's your polygon:
-
- I -> * At each vertice constituing the poly,
- I N1 you can compute a normal vector,
- /\ (preferably precalculated and rotated
- / \ like the other vertices).
- / \ using this normal vector, you can
- / \---> find the light intensity, which is
- / /-> simply the cosine of the angle between
- <---/ / N2 the normal vector and the light vector,
- -> \ / given by:
- N4 \ / -> ->
- \ / cos a = (N . L)/[L]*[N]
- \/
- I where N and L are respectively the
- I normal and the light vector and
- I -> [N] and [L] are the length of these two
- v N3 vectors.
-
- It's obvious that these two lengths should value 1.
- The intensity is then just the dot product of the two vectors.
-
- * The next step is to know, or just approximate, the intensity
- at each point of the poly. to do this, just IN-TER-PO-LA-TE !!
- For example: we just compute the intensity at each point of the
- poly nicely drawed above, you got for ex. the values a1 and a2
- corresponding to the two vectors N1 and N2.
- The value of the intensity at the point x,y of the first right
- edge is given by:
-
- a = a1 + (y - y1)*((a2 - a1)/(y2 - y1))
-
- where a1 and a2 are the intensity of the points 1 and 2
- y1 and y2 are the y-coord ON THE SCREEN of these two points.
-
- You got now all the values of the intensity along each edge
- of the poly. the second part of the algorithm concern
- the interpolation of these values along each scanline
- of the poly, thus obtaining the intensity at each pixel...
- This is done using the same scheme:
-
- a = a1 + (x - x1)*((a2 - a1)/(x2 - x1))
-
- where a1 and a2 are the intensity of the points 1 and 2
- x1 and x2 are the x-coord " " " "
-
-
- /\
- / \
- / \
- / \
- / \
- a1<__________>a2 <--- scanline
- / <--a--> \
- / \
- / \
- / \
- \ /
- \ /
- \ /
- \ /
- \ /
- \ /
- \ /
- \ /
- \ /
- \/
-
-
- Of coz it shouldn't be done using one MUL and one DIV per
- pixel ! The values (for ex. in the last formula) :
- (a2-a1) and (x2-x1) are precalculated for each scanline,
- and so is the value (a2-a1)/(x2-x1)...
- Then you set the first 'a' at the value a1 and the only
- operation you'll have to do for each pixel is just an addition
- between two fixed-point values. (a and (a2-a1)/(x2-x1))
-
- A brief example for the implementation :
- for each point of the scanline do:
-
- {asm code} {pseudo-code}
-
- MOV AL,DH ; al = a shr 8
- ADD DX,BP ; a = a+((a2-a1)/(x2-x1))
- STOSB ; es:[di] = a ; di = di+1
-
-
-
- 'GoT It ??
-
- ┌──────────────────────────────────────────┐
- │2) The next step : Z-Gouraud and Z-buffer │
- └──────────────────────────────────────────┘
-
- So now you got the code (how that 'not yet !!' ??) for
- interpolate one single value along a polygon.
-
- you can interpolate this famous intensity, as in the
- most classic Gourauds, but there is much more amusing:
- just approximate the Z component of each point of the
- polygon, using the same method.
-
- If you plot the intensity corresponding to this value
- you'll have some kind of 'depth shading' effect.
- (also known as 'Z-Gouraud', which is quite comprehensible!)
-
- Having the Z component of each point is very interesting
- when you have two faces intersecting each other.
- (f.e. when two objx are colliding !..)
-
- Just set up a huge (320*200 is good) buffer in your mem, where
- you put the Z-value of every pixel on the screen.
- when you try to write a pixel, first ask yourself:
- 'ain't there any point recovering the one I wanna write ?'
- In computer language, this is called a 'test'!:
- It just means : look if the Z-value of the current point
- is greater than the corresponding value in the Z_buffer.
- If it is, don't write it ! if not, you can write it
- on the screen (or in a screen-equivalent buffer..) and
- write its Z value in the Z-buffer !
-
- This method is -let's be reasonnable- very slow, it means
- one test per pixel, plus the refreshment of the Z_buffer
- (all points should be set to (-infiny) at every frame !),
- and a writing operation two times slower than the Gouraud
- shading one...
- But it is also the most realistic and impressive method
- for intersecting objects, Z-clipping (which can be done
- automatically using an unsigned test..),it allows
- immediate depth-shading, etc...
-
- If you didn't understood everything, these littal schemas
- are gonna help you:
-
- Z-buff: screen:
-
- ┌────────────┐ ┌────────────┐
- │ │ │ │ ░ poly #1
- │ │ │ │
- │ 1111 │ │ ░░░░ │ ▒ poly #2
- │ 2222 │ │ ░░░░ │
- │334444 │ │░░▒▒▒▒ │ █ poly #3
- │ 5555 │ │ ▒▒▒▒ │
- │ 2366766 │ │ ██▒▒█▒▒ │
- │ 23457 │ │ █████ │
- │ 23457 │ │ █████ │
- │ 23457 │ │ █████ │
- │ 23457 │ │ █████ │
- └────────────┘ └────────────┘
-
- The poly #3 is intersecting the poly #2, which is itself
- recovering the poly #1...
-
- WARNING !! these drawings assume that the Z-axis is
- pointing towards the observer.
- (the greater is the nearer..)
-
- And of coz, again a tiny example of the main loop
- implementation: (the rasterizer)
- (in 386 assembler, quite more easy than 8088..)
-
- @@:
- SHRD EBX,EDX,24 ; ebx = edx shr 8
- ADD EDX,EBP ; compute the Z value
- CMP BX,Z_BUFF[EDI*2] ; Z > z_buff[di] ??
- JG PLOT_IT ; yes ..
- INC EDI ; no: nothing
- LOOP @B
-
- JMP @F
-
- PLOT_IT:
- MOV SCREEN[EDI],AL ; store the color..
- MOV Z_BUFF[EDI*2],BX ; store the Z-value
- INC EDI ; next one!...
- LOOP @B
-
- @@:
-
- Again, don't forget to reverse the test if you are using another
- orientation..
-
- ┌────────────┐
- │3) Textures │
- └────────────┘
-
- So now you passed two months trying to understand this shit,
- you got it, nicely shaded polys rotating around each other and
- intersecting,and so on...
- But what about mapping?
- It just means put a bitmap onto a polygon, just as if
- the bitmap was itself rotating in the space.
- (I'm sure you've already seen that somewhere...)
-
- How to do it?
- we'll pass over the boring 'it's-not-possible-to-do-free-direction-
- texture-mapping-in-real-time-so-I'll-just-show-you-how-to-make-
- floors'
- real-time free direction texture-mapping IS possible and I'm
- gonna show you how...
-
- You know how to interpolate one value along a polygon:
- free distorsion/linear mapping , which are the true names
- of the method I use,only consists in interpolating
- TWO values instead of one.
- These two values are just the coordinates (UVs or IJs) of the point in
- the texture-map corresponding to the point on the poly.
-
- Well, just consider the poly we used in the Gouraud shading example:
-
- ┌───────────────────────────────────────────┬──────────────────────────────────┐
- │ iB,jB │ │
- │ /\ │<--- on the screen, it looks like:│
- │ / \ │ │
- │ / \ ├──────────────────────────────────┤
- │ / \ │ │
- │ i1,j1 / \ i2,j2 │ │
- │ <----------> <-- scanline│ in the texture, it looks like: │
- │ / <-i,j-> \ ├──────────────────────────────────┤
- │ / \ │ │
- │ / \ │ i1,j1 │
- │ / \ │ iA,jA <-------+-----> iB,jB │
- │ iA,jA \ / │ ┌───────────\──────┐ │
- │ \ / │ │ \ │ │
- │ \ / │ │ B I T -\ │ │
- │ \ / │ │ \ │ │
- │ \ / │ │ \ │ │
- │ \ / │ │ - M A P \ │ │
- │ \ / │ │ \+ i2,j2 │
- │ \ / │ │ │ │
- │ \ / │ │ │ │
- │ \/ │ └──────────────────┘ │
- │ │ │
- │ │ │
- │ │ │
- └───────────────────────────────────────────┴──────────────────────────────────┘
-
- As you can see, there are two interpolations, and, for each point, you
- got to find the offset corresponding to the two interpolated
- coordinates.
-
- The pseudo-code looks like:
-
- for each point on the scanline:
- { i=i+inc_i ;
- j=j+inc_j ;
- c=texture[i,j] ;
- plot (c) ;
- }
-
- The main difficulty is to code it in assembler, and using the less
- instructions as possible.
- I think it is possible to do it in six instructions per pixel.
- Has anybody got better ?
-
- code sample:
- (assuming a 256*X bit-map)
- ; EDX and ESI are the shifted coord.
- ; EBP and INC_J are the corresponding shifted increments.
- ; ECX is the nb. of points on the scanline
- ; EDI is the offset in the screen
-
- @@:
- MOV BX,SI ; bx <- l2 shl 8
- MOV BL,DH ; bx <- l2 shl 8 + l1
- MOV AL,TEXTURE[BX] ; get the pixel in the bitmap
- STOSB ; aff. le pixel
- ADD EDX,EBP ; incr. i
- ADD ESI,INC_J ; incr. j
- LOOP @B
-
- (I'll tell you later how to get rid of the 'INC_J' var...
- asm fans don't worry ! cf. tricks&tips: unrolled loops )
-
- Well, OK, this is not true texture-mapping (As I said, this
- is only free-disto), but the illusion works, and the method
- is fast.
- (the game DESCENT implements true texture-mapping, if
- there is a guy somewhere who knows how: please tell me !!)
-
- ┌──────────────────────────────┐
- │4) A word about Phong shading │
- └──────────────────────────────┘
-
- Well, I ain't gonna give here THE method for Phong-shade
- your polys, I don't know wich is the fastest one, but
- I am able to tell you some tricks I collected here and there..
- (Usenet discussions,f.e..).
- First you should know that many Phong that you've seen in
- demos or other are fakes, I mean they are just modified
- Gouraud shading. We can demonstrate that with a sufficient amount
- of polygons and using the Phong illumination model
- ( the shadings in the palette are non-linear ), the differences
- between G-shading and P-shading are very small (but they do exist..).
-
- The main principle is that instead of interpolating intensities, like
- in Gouraud, just interpolate normal vectors along your scanlines..
- It means three values to compute for every pixel (x,y and z coords).
- Then, when you've got your normal vector for each point, you gotta
- renormalise it,it means divide each coordinate by the current length
- of your interpolated vector. It must be done to compute properly the dot
- product between this vector and the light vector, then you've got your
- intensity !..
-
- A little drawing ?: x
- / light src
- /
- /
-
-
- ..... N ....
- ..... : ....
- N1 ... : .... N2
- ................:...............
- \ I /
- \ I /
- \ I /
- \____________I___________/
- <-- scanline
-
- N1 and N2 are the start-and-end vectors,
- N is the interpolated vector,
- the part of the N vector in ':' is the result of the renormalisation..
-
- As you can expect, this is impossible to compute in real-time..
- (using this algorithm, I mean!..)
- you've got to update 3 values (the coordinates of the
- interpolated vector), find the length, divide
- the coords by this length and then compute a dot product...
- It takes something like 3 ADDs, 3 DIVs and 6 MULs per pixel !!
-
- There are some speed-up tech.. I'm gonna make here a brief overview.
-
- First: don't compute every pixel: you can easily display three
- pixels of the same color together, or better: compute the true
- values every 4 pixels and interpolate between them...
- (I already told ya: 'interpolation' is THE word..)
-
- next: linear approximation is good, but quadratic is better !
- I've read a very interesting article in IMPHOBIA mag N°10,
- written by .. mhh .. don't remember .. (greetingz!),
- talking about QUAD-ADDERS : a very efficient method for
- interpolating between three values using a parabolic curve.
- It only takes 2 ADD's per pixel !
- you just have to compute three values on your scanline (with kewl
- MUL's and DIV's !!), and then interpolate (again!) between 'em.
- Just read this article, coz the development is too long for this txt...
-
- another one: 'FAST PHONG' by Mister Bishop (??), using taylor
- approximations... I didn't read it. :/
- see ACM computer graphics 20(4) pp.103-106 '1986
-
- Some other methods were developped, using angular or bi-angular
- interpolations.. (see f.e. 'faster Phong shading via angular interpo-
- -lation', Kujik & Blake, computer graphics forum 8 (1989)).
-
- Also think about spherical coordinates and look-up tables.. ;)
-
- ┌───────────────────┐
- │5) Tricks and tips │
- └───────────────────┘
-
- a)- fixed-point maths
- ---------------------
-
- For those who didn't understand how I was doing my incrementations,
- here's a brief recap of the fixed-point principle:
-
- We assume the values you're computing are made of two parts:
- the integer part and the decimal part, the one after the point.
- these two parts are contained in one number and you can decide
- which bits are significant for the int. or dec. part.
- It's very easy to implement : You just have to shift your numbers
- of n bits, meaning you multiply them by 2^n. When you want your
- integer approx., just 'de-shift' them by the right number of bits.
-
- It's even more easier when you shift your values of 8 bits.
- The 80X88 architecture allows you to split your regs in two
- 8-bits parts. So if you've got your shifted value in AX,
- the integer approximation will be contained in AH.
- If you work with 16 bits values, for ex. in AX, your shifted
- val should be in EAX, so that the extended part contains the
- integer.(You can even work with 24 bits values, easy to implement
- when smartly using the SHLD/SHRD 386 instr.)
-
-
- b)- single bytes suckz
- ----------------------
-
- All the main-loops examples I wrote only write one byte at each loop.
- It's much more efficient -when possible- to write two bytes (also
- called a word huhu!) at the same time.
- For example, here's the new Gouraud main loop :
- (5 inst. for two pixels!)
-
- SHR ECX,1
- @@:
- MOV AL,DH
- ADD EDX,EBP
- MOV AH,DH
- STOSW
- ADD EDX,EBP
- LOOP @B
-
- But warning ! This only works when you are writing at even offsets.
- Otherwise it won't speed up much, because when the CPU write a word on
- a byte border, it's as slow as writing two bytes 'manually'.
- So you'll have to check - before entering the loop - the parity
- of your first offset, first write one byte if it's odd, and the same
- when the loop is finished...
-
- c)- about the 'loop' instruction
- --------------------------------
-
- A word about the 'loop' instruction: I used it for clarity in my codes,
- but the fastest method for looping is the following:
-
- instead of: use:
- LOOP @B DEC ECX
- JNZ @B
-
- The result is the same but it works much more faster !
- (Perhaps no more on P5, ..must be verified..)
-
- d)- unrolled loops
- ------------------
-
- The last tip is fine but the real fastest method for looping
- is ... not to loop !
- Don't you guess ?
- The trick is to 'unroll' your loop, I mean when you know the
- maximum number of times you may loop (usually 320 for a scanline ,
- let's say MAX) you write all the instruction one after one.
- When you enter your loop just jump at the offset LABEL+(MAX-ECX)*N,
- where N is the number of bytes of the instr. constituing your loop.
- Of coz you gotta use the 'REPEAT MAX (...) ENDM' macro-inst for
- writing the code, which induces a space loss, but it's really
- worth it.
- For the Gouraud:
-
- JMP ... ; do it yourself, big boy !
-
- REPEAT 320
- MOV AL,DH ; al = a shr 8
- ADD DX,BP ; a = a+((a2-a1)/(x2-x1))
- STOSB ; es:[di] = a ; di = di+1
- ENDM
-
- I said when talking about the texture mapping that it was possible
- to get rid of the var : just replace it by ECX which is not used
- anymore in the loop !
-
- ***** last minute *****
- It seems that the unrolled loops are not so efficient
- -if not less- on the new pipe-lined/"RISC" architecture of the Pentium..
- I am not a hardware freak, but so many people told me that
- I think I just have to believe it !
- It's your matter to know wich config your prodz should work on..
- And one way to know is TESTING.
-
- e)- Z-buffer optimizations
- --------------------------
-
- Yes, there are many !
- DO you know what a depth-sort is ?
- Well, it's just an implementation of the famous painter's algorithm:
- sort your polygons 'back-to-front', so that the nearest polys are
- recovering the furthest.
- For Z-buffer, just reverse : sort your polys 'front-to-back',
- so that -with a bit of luck- you just write the necessary pixels
- on the screen, and not the one which are going to be recovered.
-
- John De Goes - creditz ! - told me some other trickz too:
- f.e:
- - you normally don't have to clear your Z-buffer every frame
- - in some special cases, you don't have to test every pixel
- on your scanline
- - You should be careful too on the Pentium branch-prediction
- chip(see 'last-minute' above): if most of the pixel of the
- poly will be written,execute the jump only if the current
- point is not written..
- - I heard talking about 'fuzzy' algorithms wich may be helpful
- (just heard..)
-
- (see 'Zbuffinf.txt', kewl doc)
-
- ┌────────────────────┐
- │6)- conclusion words│
- └────────────────────┘
-
- I hope this doc will be helpful for beginners in 3d coding.
- It's the kind of help I would have appreciated when I was
- beginning one year ago.
- It perhaps seems a bit confused and the level of each part
- may be variable, but I didn't write it in one time, so
- if some explanations are not clear, don't hesitate to ask me..
- (Sorry for the bad english. I speak french, so be indulgent..)
-
- I still have much to do in 3d coding:
- - implement a TRUE texture-mapping algo,
- not a linear distorsion.
- - I will probably releaze a txt about Phong shading,
- including a (more or less) efficient implementation.
- - BSP trees and/or Z-buffer for a 3d virtual world.
- - The use of extended VESA/SVGA graphic modes such as
- Hi-color or 8bits-Hi-rez modes.
- - Pentium optimizations trix.
- any help is appreciated..!
-
- ┌──────────────────────────┐
- │7)- Credits and greetingz │
- └──────────────────────────┘
-
- Doc written by :
- ----------------
- KARMA [ Tfl/TdV = The Flamoots/The Dark Vision ]
- (Jean Cardinal)
- CONTACT ME !!
- for anything: discussion,advices,remarks,flaming,collaboration...
- E-mail:
- jcardin@is1.ulb.ac.be
-
- Thanx &/or greetingz to:
- ------------------------
-
- MORFLAME [TfL/TdV]
- -we gonna do good work together!
-
- JOHN DE GOES [on Compuserve]
- -Cool docs !
-
- ALL THE MEMBS OF MY CREW :
- type one,sam,cybersurfer,
- fred,bismarck,gopi,fly,zoltan,Rod...
-
- IMPHOBIA
- -'nice' is the word
-
- and everyone I forgot...
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-